Feature Selection and Clustering in Software Quality Prediction

Authors

Qi Wang

Jie Zhu

Bo Yu

Abstract

Software quality prediction models use software metrics and fault data collected from previous releases or similar projects to predict the quality of software components under development. Previous research has shown that such models can yield predictions with impressive accuracy. However, building an accurate software quality prediction model remains challenging for two reasons. First, outliers in software data often have a disproportionate effect on the overall predictive ability of the model. Second, not all collected software metrics should be used to construct the model, because of the curse of dimensionality. To address these two problems, we present a new software quality prediction model based on a genetic algorithm (GA), in which outlier detection and feature selection are executed simultaneously. The experimental results show that this model performs better than some recently proposed software quality prediction models based on S-PLUS and TreeDisc. Furthermore, the clustered software components and selected features are easier for software engineers and data analysts to study and interpret.
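The abstract does not give the chromosome encoding or the fitness function, so the following is only a minimal sketch of how a GA might search for a feature subset and an outlier set at the same time. Everything in it is an assumption made for illustration: the bit-string layout (one bit per metric, one bit per component), the nearest-centroid classifier used as a stand-in for the paper's model, the penalty weights `feat_penalty` and `drop_penalty`, and the function names `kept_accuracy` and `evolve`.

```python
import random

def kept_accuracy(X, y, feat_mask, keep_mask):
    """Accuracy of a nearest-centroid classifier (a stand-in model, not the
    paper's) trained and scored on the kept rows using the selected features."""
    feats = [j for j, bit in enumerate(feat_mask) if bit]
    kept = [i for i, bit in enumerate(keep_mask) if bit]
    if not feats or not kept:
        return 0.0
    centroids = {}
    for label in set(y[i] for i in kept):
        rows = [X[i] for i in kept if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(feats, centroids[c])))

    return sum(predict(X[i]) == y[i] for i in kept) / len(kept)

def evolve(X, y, pop_size=30, generations=40, p_mut=0.02,
           feat_penalty=0.05, drop_penalty=0.5, seed=0):
    """Hypothetical GA: each chromosome is [feature bits | keep-row bits]."""
    rng = random.Random(seed)
    n_feat, n_rows = len(X[0]), len(X)
    length = n_feat + n_rows

    def fitness(ch):
        feat_mask, keep_mask = ch[:n_feat], ch[n_feat:]
        acc = kept_accuracy(X, y, feat_mask, keep_mask)
        # Assumed penalties: prefer few features and few discarded rows.
        return (acc
                - feat_penalty * sum(feat_mask) / n_feat
                - drop_penalty * (n_rows - sum(keep_mask)) / n_rows)

    # Start with random feature subsets and most rows kept.
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] +
           [1 if rng.random() < 0.9 else 0 for _ in range(n_rows)]
           for _ in range(pop_size)]

    for _ in range(generations):
        next_pop = sorted(pop, key=fitness, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, length)          # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop

    best = max(pop, key=fitness)
    return best[:n_feat], best[n_feat:], fitness(best)
```

Calling `evolve(X, y)` on a list of metric rows `X` and fault labels `y` returns a feature mask, a keep-row mask (rows with a 0 bit are treated as outliers), and the fitness of the best chromosome. Because both masks live in one chromosome, a candidate feature subset is always judged together with a candidate outlier set, which is the property the abstract emphasizes.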

Similar resources

A Novel Feature Subset Selection Algorithm for Software Defect Prediction

Feature subset selection is the process of choosing a subset of good features with respect to the target concept. A clustering based feature subset selection algorithm has been applied over software defect prediction data sets. Software defect prediction domain has been chosen due to the growing importance of maintaining high reliability and high quality for any software being developed. A soft...

Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines

In this paper, principles and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search-based procedures, as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...

Stock Price Prediction using Machine Learning and Swarm Intelligence

Background and Objectives: Stock price prediction has become one of the interesting and also challenging topics for researchers in the past few years. Due to the non-linear nature of the time-series data of the stock prices, mathematical modeling approaches usually fail to yield acceptable results. Therefore, machine learning methods can be a promising solution to this problem. Methods: In this...

Evaluation of Classifiers in Software Fault-Proneness Prediction

Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

Publication date: 2007
